Towards a corpus-based dictionary of German noun-verb collocations

Author

  • Ulrich Heid
Abstract

We describe our attempts to automatically extract raw material for a dictionary of German noun-verb collocations from large corpora of newspaper text. Such a dictionary should be about collocations, and it should describe their linguistic properties rather than merely list lexical cooccurrences. Since most statistical collocation-finding tools provide only lexical cooccurrence information, we first use symbolic extraction tools, based on a regular grammar over part-of-speech tagged and lemmatized text, and apply statistical filters thereafter. We first list the types of information that a collocational dictionary for Natural Language Processing should contain, then sketch our extraction methods, and finally discuss and illustrate our initial results.
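The two-stage idea in the abstract (symbolic extraction first, statistical filtering second) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the STTS-style tag names, the single pattern "noun immediately followed by a full verb" (as in German verb-final clauses), the toy sentences, the choice of pointwise mutual information as the filter, and the helper names `extract_pairs`, `count_pairs`, and `pmi_filter`.

```python
import math
import re
from collections import Counter

# Hypothetical toy input: each sentence is a list of (lemma, POS tag) pairs.
# Tag names follow STTS conventions (NN = noun, VV* = full verb); this is an assumption.
SENTENCES = [
    [("dass", "KOUS"), ("sie", "PPER"), ("eine", "ART"), ("Entscheidung", "NN"), ("treffen", "VVFIN")],
    [("weil", "KOUS"), ("er", "PPER"), ("die", "ART"), ("Entscheidung", "NN"), ("treffen", "VVFIN")],
    [("dass", "KOUS"), ("sie", "PPER"), ("ein", "ART"), ("Buch", "NN"), ("lesen", "VVFIN")],
    [("weil", "KOUS"), ("er", "PPER"), ("die", "ART"), ("Frage", "NN"), ("stellen", "VVFIN")],
]

# Stage 1 (symbolic): a regular expression over the tag sequence.
# It matches a noun tag directly followed by any full-verb tag.
PATTERN = re.compile(r"(?<!\S)NN (VV\w*)(?!\S)")


def extract_pairs(sentence):
    """Return (noun lemma, verb lemma) pairs matched by PATTERN."""
    lemmas = [lemma for lemma, _ in sentence]
    tags = " ".join(tag for _, tag in sentence)
    pairs = []
    for match in PATTERN.finditer(tags):
        # The number of spaces before a position equals the token index.
        noun_idx = tags.count(" ", 0, match.start())
        verb_idx = tags.count(" ", 0, match.start(1))
        pairs.append((lemmas[noun_idx], lemmas[verb_idx]))
    return pairs


def count_pairs(sentences):
    """Count extracted noun-verb lemma pairs over all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(extract_pairs(sentence))
    return counts


def pmi_filter(pair_counts, min_count=2, min_pmi=0.0):
    """Stage 2 (statistical): keep frequent pairs with PMI >= min_pmi.

    Probabilities are estimated from the extracted pairs only.
    """
    total = sum(pair_counts.values())
    noun_counts, verb_counts = Counter(), Counter()
    for (noun, verb), count in pair_counts.items():
        noun_counts[noun] += count
        verb_counts[verb] += count
    kept = {}
    for (noun, verb), count in pair_counts.items():
        if count < min_count:
            continue
        pmi = math.log2(count * total / (noun_counts[noun] * verb_counts[verb]))
        if pmi >= min_pmi:
            kept[(noun, verb)] = pmi
    return kept


if __name__ == "__main__":
    for pair, score in pmi_filter(count_pairs(SENTENCES)).items():
        print(pair, round(score, 3))
```

Because the pattern runs over lemmas and tags rather than raw word forms, each hit keeps its link to the sentence it came from, so linguistic properties such as determiner use or verb form could in principle be read off the match. A purely lexical cooccurrence count would not retain that information.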


Similar resources

Verb-Noun Collocation SyntLex Dictionary: Corpus-Based Approach

The project presented here is part of a long-term research program aiming at a full lexicon grammar for Polish (SyntLex). The main concern of this project is the computer-assisted acquisition and morpho-syntactic description of verb-noun collocations in Polish. We present methodology and resources obtained in three main project phases which are: dictionary-based acquisition of collocation lexicon...


Collocations of Complex Nouns: Evidence for Lexicalisation

This paper combines a corpus-based study of noun+verb collocations with an attempt to distinguish compositional, regularly formed compounds from lexicalised ones. We claim that morphologically regular, compositional compounds share most of their collocational preferences with their compound heads, whereas lexicalised compounds have their own collocational preferences, distinct or only marginall...


Towards Distributional Semantics-based Classification of Collocations for Collocation Dictionaries

Automatic acquisition of raw source material is of great aid for the compilation of dictionaries, and, in particular, of specialized dictionaries such as collocation dictionaries. The extraction of collocations from corpora has been actively worked on since the late eighties. The quality of the state-of-the-art extraction algorithms allows the lexicographers to obtain lists of collocations they...


Extraction of V-N-Collocations from Text Corpora: A Feasibility Study for German

The usefulness of a statistical approach suggested by Church and Hanks (1989) is evaluated for the extraction of verb-noun (V-N) collocations from German text corpora. Some motivations for the extraction of V-N collocations from corpora are given and a couple of differences concerning the German language are mentioned that have implications on the applicability of extraction methods developed f...


Japanese Learners’ Dictionary of I-adjective-noun Collocations

This paper demonstrates a method for creating a Japanese learners' dictionary of i-adjective-noun collocations. After introducing the importance of collocations and the necessity of including them in Japanese language learning, we present various corpus types and corpus query tools that are used to obtain a variety of collocational usage in different types of discourse. The Japanese languag...


Publication date: 1998